Competing risk models to evaluate the factors for time to loss to follow-up among tuberculosis patients at Ambo General Hospital

Background A major challenge for most tuberculosis (TB) programs is that many patients fail to complete treatment for one reason or another. Failure to complete treatment contributes to the emergence of multidrug-resistant TB. This study aimed to evaluate the risk factors for time to loss to follow-up, treating death as a competing risk event, among tuberculosis patients enrolled in the directly observed treatment short course at Ambo General Hospital, Ambo, Ethiopia.
Methods Data collected from 457 tuberculosis patients from January 2018 to January 2022 were used for the analysis. The cause-specific hazard (CSH) and sub-distribution hazard (SDH) models for competing risks were used to model the outcome of interest and to identify the prognostic factors associated with loss to follow-up. Loss to follow-up was the outcome of interest and death was the competing event.
Results Of the 457 tuberculosis patients enrolled, 54 (11.8%) were lost to follow-up and 33 (7.2%) died during the follow-up period. The median time to loss to follow-up, measured from the date of treatment initiation, was 4.2 months. The cause-specific hazard and sub-distribution hazard models revealed that sex, place of residence, HIV status, contact history, age and baseline weight were significant risk factors associated with time to loss to follow-up. The estimated covariate effects differed between the cause-specific and sub-distribution hazard models; the maximum relative difference observed for a covariate between the cause-specific and sub-distribution hazard ratios was 12.2%.
Conclusions Patients who were male, rural residents, HIV positive, and aged 41 years or older were at higher risk of loss to follow-up. This underlines the need for tuberculosis patients, especially those in the risk categories, to be made aware of the length of the directly observed treatment short course and the effects of discontinuing treatment.
Supplementary Information The online version contains supplementary material available at 10.1186/s13690-023-01130-2.


Competing risk for analyzing survival data
Within the framework of competing risks modeling, the two most commonly used approaches are the cause-specific hazard (CSH) Cox approach and the Fine-Gray proportional sub-distribution hazard (SDH) model [1]. In standard survival analysis, the Cox proportional hazards model is a semiparametric model in which dependence on the explanatory variables is modelled explicitly but no specific probability distribution is assumed for the survival times. An analogous Cox regression approach can also be applied through CSH regression when competing risks are present. In the analysis of times to a certain event k, the CSH is the instantaneous rate of experiencing cause k among those who are event-free (i.e., have not yet had cause k or any of the competing events). A straightforward way of applying the CSH approach is to fit a separate Cox model for each cause, treating any competing events as censored at their time of occurrence.
The one-to-one correspondence between hazard and survival that holds in standard survival analysis does not necessarily hold when competing risks are present [2]. As a consequence, the effect of a covariate on the CSH for a particular cause may differ from its effect on the probability of that event occurring. To overcome these problems of interpretation with the CSH approach, Fine and Gray [1] introduced the concept of an SDH, which has a one-to-one correspondence with the cumulative incidence of the event. The SDH can be modeled in a proportional hazards framework using the cumulative incidence function (CIF). The CIF $F_k(t)$ is the probability of failing from cause $k$ by a given time $t$; it can be used to describe a certain population or to compare a discrete number of subgroups descriptively, and is defined by
$$F_k(t) = P(T \le t,\; D = k), \tag{1}$$
with $T$ and $D$ being random variables representing the time to the first observed event and the type of event, respectively. When competing risks are present, the CIF in a population or in subgroups of interest can be estimated using
$$\hat{F}_k(t) = \sum_{i:\, t_i \le t} \hat{S}(t_{i-1})\, \hat{\lambda}^{cs}_k(t_i). \tag{2}$$
In expression (2), $\hat{S}(t)$ is the estimator of the overall survivor function at time $t$, counting all types of event, and $t_i$ denotes the $i$th ordered event time. $\lambda^{cs}_k(t)$ is the CSH rate, given by
$$\lambda^{cs}_k(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t,\; D = k \mid T \ge t)}{\Delta t}. \tag{3}$$
The CSH rate in expression (3) can be estimated by
$$\hat{\lambda}^{cs}_k(t_i) = \frac{d_{ki}}{n_i}, \tag{4}$$
where $d_{ki}$ is the number of failures of type $k$ at time $t_i$ and $n_i$ is the size of the risk set at time $t_i$, i.e. the number of patients who have neither been censored nor failed from any cause before $t_i$.
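The nonparametric CIF estimator described above can be illustrated with a short sketch. The Python code below is only an illustration under stated assumptions and not the implementation used in this study (the analysis itself was done in R): the function name `cumulative_incidence`, the event coding (0 = censored, 1 = loss to follow-up, 2 = death) and the toy data are hypothetical. At each ordered event time it multiplies the overall event-free survival just before that time by the estimated cause-specific hazard d_ki / n_i and accumulates the result.

```python
def cumulative_incidence(times, events, cause):
    """Nonparametric CIF estimate for one cause.

    times  : observed follow-up times
    events : 0 = censored, 1, 2, ... = type of the first event (assumed coding)
    Returns a list of (t_i, F_hat(t_i)) at each distinct event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e != 0})
    surv = 1.0  # overall event-free survival just before the current event time
    cif = 0.0
    out = []
    for t in event_times:
        n_i = sum(1 for s in times if s >= t)  # size of the risk set at t
        d_all = sum(1 for s, e in zip(times, events) if s == t and e != 0)
        d_k = sum(1 for s, e in zip(times, events) if s == t and e == cause)
        cif += surv * d_k / n_i        # S_hat(t_{i-1}) * d_ki / n_i
        surv *= 1.0 - d_all / n_i      # update overall survival with all causes
        out.append((t, cif))
    return out


# Hypothetical toy data (months); not taken from the study.
times = [1, 2, 3, 4, 5, 6]
events = [1, 2, 0, 1, 2, 1]

cif_ltfu = cumulative_incidence(times, events, cause=1)
cif_death = cumulative_incidence(times, events, cause=2)
for (t, f1), (_, f2) in zip(cif_ltfu, cif_death):
    print(f"t={t}: CIF(loss to follow-up)={f1:.3f}, CIF(death)={f2:.3f}")
```

In this toy example the last subject fails at the final time point, so the two estimated CIFs sum to one at the end of follow-up; with censoring after the last event they would sum to less than one.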

Regression models for competing risks
Cause-specific Hazard Regression
The CSH regression model is used to estimate the effect of covariates on the rate of occurrence of the outcome in those subjects who are currently event free. If the covariate is continuous, or if the simultaneous effect of several covariates on cause-specific failure is of interest, a competing risks analogue of the Cox proportional hazards (CPH) model is the most logical choice. The estimation of covariate effects on the CSH rate was proposed by Prentice et al. [3]. In a semi-parametric Cox regression approach [4], the CSH of cause $k$ for a subject with covariate vector $X$ is modeled as
$$\lambda^{cs}_k(t \mid X) = \lambda^{cs}_{k,0}(t)\, \exp(\beta_k^{\top} X), \tag{5}$$
where $\lambda^{cs}_{k,0}(t)$ is the baseline CSH of cause $k$ and $\beta_k$ is the vector of unknown regression parameters to be estimated for cause $k$.
The CSHs completely determine the competing risks process [5]. Hence, the CIFs can be estimated from separate CSH regression models for all types of event; for instance, the CIF for the $k$th out of $K$ events is
$$F_k(t \mid X) = \int_0^t \exp\!\Big(-\sum_{l=1}^{K} \Lambda^{cs}_l(s \mid X)\Big)\, \lambda^{cs}_k(s \mid X)\, ds, \tag{6}$$
where $\Lambda^{cs}_l(s \mid X)$ denotes the cumulative CSH for event $l$ at time $s$ for a given covariate vector $X$, defined by $\Lambda^{cs}_l(s \mid X) = \int_0^s \lambda^{cs}_l(u \mid X)\, du$.
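As a rough numerical illustration of how the CIF follows from the cause-specific hazards of all event types, the sketch below integrates the overall survival times the hazard of the cause of interest with the midpoint rule. The function name `cif_from_csh` and the constant hazard values are assumptions made purely for illustration; with constant hazards the integral has a closed form, which gives a simple check.

```python
import math


def cif_from_csh(hazards, k, t, steps=20000):
    """Approximate the CIF of cause k at time t from all cause-specific hazards.

    hazards : list of functions s -> cause-specific hazard at time s
    k       : index (0-based) of the cause of interest
    Uses the midpoint rule for both the cumulative hazards and the outer integral.
    """
    h = t / steps
    cum = [0.0] * len(hazards)  # cumulative CSHs at the left end of each step
    total = 0.0
    for j in range(steps):
        s = (j + 0.5) * h
        rates = [lam(s) for lam in hazards]
        # cumulative hazards advanced to the midpoint of the step
        mid_cum = [c + 0.5 * h * r for c, r in zip(cum, rates)]
        surv = math.exp(-sum(mid_cum))  # probability of being event free at s
        total += surv * rates[k] * h
        cum = [c + h * r for c, r in zip(cum, rates)]
    return total


# Hypothetical constant hazards per month (illustrative values only).
lam_ltfu, lam_death = 0.02, 0.01
hazards = [lambda s: lam_ltfu, lambda s: lam_death]

t = 12.0
numeric = cif_from_csh(hazards, k=0, t=t)
lam_tot = lam_ltfu + lam_death
closed_form = lam_ltfu / lam_tot * (1.0 - math.exp(-lam_tot * t))
print(numeric, closed_form)
```

The example makes the dependence on the competing event visible: raising the hazard of death lowers the CIF of loss to follow-up even though the cause-specific hazard of loss to follow-up is unchanged.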

Sub-distribution Hazard Regression: the Fine and Gray model
The Fine-Gray model allows us to estimate the effect of covariates on the absolute risk of the outcome over time [1]. For event $k$, the SDH is the instantaneous rate of failure from cause $k$ in an infinitesimally small time interval $\Delta t$, given that the subject has either experienced no event until time $t$ or has experienced an event other than $k$ before time $t$ [6]:
$$\lambda^{*}_k(t) = \lim_{\Delta t \to 0} \frac{P\big(t \le T < t + \Delta t,\; D = k \mid T \ge t \ \text{or}\ (T < t \ \text{and}\ D \ne k)\big)}{\Delta t}. \tag{7}$$
Individuals who fail before time $t$ from a cause other than the cause of interest remain in the risk set for all future failure times. For events recorded in discrete time with no censoring, the SDH at time $t_i$ can be estimated by
$$\hat{\lambda}^{*}_k(t_i) = \frac{d_{ki}}{n^{*}_i}, \tag{8}$$
where $d_{ki}$ denotes the number of failures of type $k$ at time $t_i$ and $n^{*}_i$ is the size of the modified risk set, which includes all subjects who did not experience any event until time $t_i$ and all subjects who failed before $t_i$ from a cause other than $k$. The SDH is thus used to model the risk of experiencing a specific event in subjects who have not yet experienced this event: it denotes the instantaneous risk of failure from the $k$th event in subjects who have not yet experienced an event of type $k$. The basic difference between the two hazards lies in their risk sets. The risk set is the set of subjects under investigation who are vulnerable to the event. For the SDH, the risk set includes those who are currently event free as well as those who have previously experienced a competing event, while the risk set of the CSH only includes those who are currently event free.
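The modified risk set can be made concrete with a small sketch for the uncensored, discrete-time case described above. The function name `subdistribution_hazard`, the event coding and the toy data are hypothetical. Subjects who have already failed from a competing cause stay in the denominator, and combining the estimated SDHs as one minus the product of (1 - hazard) recovers the observed proportion of subjects who fail from the cause of interest.

```python
def subdistribution_hazard(times, events, cause):
    """Estimate d_ki / n*_i at each distinct failure time of `cause`.

    Assumes no censoring: every entry of `events` is a cause label (1, 2, ...).
    """
    if any(e == 0 for e in events):
        raise ValueError("this sketch only covers data without censoring")
    out = []
    for t in sorted({s for s, e in zip(times, events) if e == cause}):
        d_k = sum(1 for s, e in zip(times, events) if s == t and e == cause)
        # modified risk set: still event free at t, or failed earlier from another cause
        n_star = sum(1 for s, e in zip(times, events) if s >= t or e != cause)
        out.append((t, d_k / n_star))
    return out


# Hypothetical toy data: 1 = loss to follow-up, 2 = death (no censoring).
times = [1, 2, 3, 4, 5]
events = [1, 2, 1, 2, 1]

sdh = subdistribution_hazard(times, events, cause=1)

# CIF at the end of follow-up implied by the SDH estimates
remaining = 1.0
for _, lam in sdh:
    remaining *= 1.0 - lam
cif_end = 1.0 - remaining
print(sdh, cif_end)
```

Here three of the five subjects are lost to follow-up, and the CIF implied by the estimated SDHs at the last time point is 0.6, which is the direct link between the SDH and the cumulative incidence that the CSH lacks.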
The SDH (the hazard of the cumulative incidence) for each cause, interpreted as the hazard for an individual who either fails from cause $k$ or does not, can be written as
$$\lambda^{*}_k(t) = -\frac{d}{dt} \log\{1 - F_k(t)\}. \tag{9}$$
Under proportional hazards, the Fine-Gray model [7] can be specified as
$$\lambda^{*}_k(t \mid X) = \lambda^{*}_{k,0}(t)\, \exp(\beta_k^{\top} X), \tag{10}$$
where $\lambda^{*}_{k,0}(t)$ denotes the baseline SDH for cause $k$. The SDH rates are assumed to be proportional for the included covariates and are directly linked to the CIF, in the way known from standard survival analysis with one possible endpoint, by
$$F_k(t \mid X) = 1 - \exp\!\Big(-\int_0^t \lambda^{*}_k(s \mid X)\, ds\Big). \tag{11}$$
Hence, the CIF for the event of interest can be estimated directly from the regression coefficients obtained from a Fine and Gray model, without explicit consideration of the covariate effects on competing events.

Model Comparison for the CSH and SDH regressions
In the presence of two possible types of failure, the relationship between the cause-specific and sub-distribution hazards [5] can be derived from expressions (6) and (11):
$$\lambda^{cs}_1(t \mid X) = \left(1 + \frac{F_2(t \mid X)}{S(t \mid X)}\right) \lambda^{*}_1(t \mid X), \tag{12}$$
where $S(t \mid X)$ denotes the probability of being free of any event until time $t$ given $X$. In addition, for each covariate the relative difference (RD) between the cause-specific and sub-distribution hazard ratios can be computed following [8].
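The relationship between the two hazards can be checked numerically in a simple special case. The sketch below assumes two constant cause-specific hazards (hypothetical values, no covariates), for which the overall survival and both CIFs have closed forms; the function name `csh_from_sdh` is likewise an assumption. It computes the SDH of cause 1 as the sub-density divided by one minus the CIF of cause 1, then applies the factor 1 + F_2(t)/S(t) and recovers the cause-specific hazard.

```python
import math


def csh_from_sdh(lam1, lam2, t):
    """Recover the cause-1 CSH from the cause-1 SDH under constant CSHs.

    Returns (sdh1, recovered_csh1) at time t.
    """
    lam_tot = lam1 + lam2
    surv = math.exp(-lam_tot * t)           # S(t): free of any event at t
    f1 = lam1 / lam_tot * (1.0 - surv)      # CIF of cause 1
    f2 = lam2 / lam_tot * (1.0 - surv)      # CIF of cause 2
    sub_density1 = lam1 * surv              # derivative of the cause-1 CIF
    sdh1 = sub_density1 / (1.0 - f1)        # sub-distribution hazard of cause 1
    recovered = (1.0 + f2 / surv) * sdh1    # relationship between CSH and SDH
    return sdh1, recovered


# Hypothetical hazards per month (illustrative values only).
lam1, lam2 = 0.02, 0.01
for t in (1.0, 6.0, 24.0):
    sdh1, recovered = csh_from_sdh(lam1, lam2, t)
    print(f"t={t}: SDH={sdh1:.5f}, recovered CSH={recovered:.5f}")
```

The SDH lies below the CSH for every t > 0, and the gap widens as the competing event accumulates, which is one reason the hazard ratios from the two models need not agree.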

Estimation of parameters
Partial likelihood is employed to estimate the coefficient vector [9]. Let $0 < t_{1k} < t_{2k} < \dots < t_{uk}$ be the ordered distinct time points at which failures from cause $k$ occur. Assume that only one failure can happen at each failure time, i.e. there are no tied failure times in the data. The partial likelihood [9] for the specific hazard $k$ is given by
$$L_k(\beta_k) = \prod_{i=1}^{n_k} \frac{\exp(\beta_k^{\top} X_{ik})}{\sum_{l \in R(t_{ik})} \exp(\beta_k^{\top} X_{lk})}, \tag{14}$$
where $n_k$ is the number of individuals who fail from cause $k$ (equal to $u$ when there are no ties), $X_{ik}$ is the vector of covariates for individual $i$ specific to the $k$-type risk, and the vector $\beta_k$ contains the regression coefficients of cause $k$ to be estimated. Since the same variables can have different effects on the different risks, the $\beta_k$ are estimated separately for each $k$, and the set of individuals at risk at time $t_{ik}$ is $R(t_{ik}) = \{l \mid t_{lk} \ge t_{ik}\}$. Using expression (14), the overall partial likelihood function is
$$L(\beta_1, \dots, \beta_K) = \prod_{k=1}^{K} L_k(\beta_k). \tag{15}$$
Estimation of the parameters in the Fine-Gray model uses the partial likelihood approach with weights [10], given by
$$L(\beta) = \prod_{i:\, D_i = k} \frac{\exp(\beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip})}{\sum_{l \in R(t_i)} w_{il}\, \exp(\beta_1 X_{l1} + \beta_2 X_{l2} + \dots + \beta_p X_{lp})}, \tag{16}$$
where the product runs over the subjects who fail from the event of interest $k$, $\beta_1, \beta_2, \dots, \beta_p$ are the regression parameters for event type $k$, and $X_{i1}, X_{i2}, \dots, X_{ip}$ are the predictor variables for subject $i$. The weighted risk set $R(t_i)$ is built from the non-censored subjects and is formed with respect to the event of interest: $R(t_i) = \{l : T_l \ge t_i,\ \text{or}\ T_l \le t_i\ \text{and individual } l \text{ has experienced a competing event}\}$.
The weight $w_{il}$ in expression (16), attached to subject $l$ in the risk set at time $t_i$, is defined by
$$w_{il} = \frac{\hat{G}(t_i)}{\hat{G}(\min(t_i, t_l))}, \tag{17}$$
where $\hat{G}(\cdot)$ is the estimated survival function of the censoring distribution, as in the Fine-Gray weighting scheme, $\hat{G}(t_i)$ is its value at the survival time of subject $i$, and $\hat{G}(\min(t_i, t_l))$ is its value at the smaller of the survival times of subjects $i$ and $l$. The weight equals 1 if $t_i \le t_l$ and is at most 1 if $t_l < t_i$ [11]. Once the likelihood is formulated, the goal is to choose the parameter values that maximize it.
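A minimal sketch of the weighting scheme follows, assuming that the function G in the weights is the Kaplan-Meier estimate of the censoring distribution (censoring treated as the "event"), as in the usual Fine-Gray formulation. The names `censoring_km` and `fg_weight`, the event coding and the toy data are hypothetical, and the sketch ignores the left-limit and tie conventions that full implementations such as the cmprsk package handle.

```python
def censoring_km(times, events):
    """Kaplan-Meier estimate of the censoring survival function G.

    events : 0 = censored, 1, 2, ... = type of event (assumed coding)
    Returns a function u -> G_hat(u) (right-continuous step function).
    """
    steps = []
    g = 1.0
    for t in sorted({s for s, e in zip(times, events) if e == 0}):
        n = sum(1 for s in times if s >= t)  # still under observation at t
        c = sum(1 for s, e in zip(times, events) if s == t and e == 0)
        g *= 1.0 - c / n
        steps.append((t, g))

    def G(u):
        value = 1.0
        for t, g_t in steps:
            if t <= u:
                value = g_t
        return value

    return G


def fg_weight(G, t_i, t_l):
    """Weight of subject l (time t_l) in the risk set at failure time t_i.

    Note: undefined if G_hat has dropped to zero at min(t_i, t_l).
    """
    return G(t_i) / G(min(t_i, t_l))


# Hypothetical toy data: 0 = censored, 1 = loss to follow-up, 2 = death.
times = [2, 3, 4, 5, 6, 8]
events = [1, 0, 2, 1, 0, 1]
G = censoring_km(times, events)

# Subject who died at month 4, evaluated at later loss-to-follow-up times
w_at_5 = fg_weight(G, t_i=5, t_l=4)
w_at_8 = fg_weight(G, t_i=8, t_l=4)
# Subject still under observation at month 8, evaluated at month 5
w_event_free = fg_weight(G, t_i=5, t_l=8)
print(w_at_5, w_at_8, w_event_free)
```

Subjects who are still event free at the failure time get weight 1, while a subject retained after a competing event is gradually down-weighted as censoring accumulates after that event, mimicking the chance that the subject would still be under observation.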

Model Diagnostics
The main assumption when modeling survival data is the proportionality of hazards. When the Fine-Gray model is used, the SDHs (the hazards of the CIF) must be proportional, whereas in the cause-specific Cox model it is the CSHs that need to be proportional [11]. Proportionality is the most common assumption in competing risks regression models: the covariates X are taken to act as a constant shift on the complementary log-log scale from a baseline sub-distribution function. If the corresponding curves for the covariate groups do not cross each other, the model is taken not to violate the assumption of proportionality [12].
Once the data had been prepared, all statistical analyses were carried out using the survival and cmprsk packages of the R statistical software (version 4.2.1).